{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# OneRClassifier: One Rule (OneR) method for classification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And implementation of the One Rule (OneR) method for classification."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> from mlxtend.classifier import OneRClassifier"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\"OneR\" stands for One Rule (by Robert Holte [1]), which is a classic algorithm for supervised learning. Note that this algorithm is not known for its good prediction performance; thus, it is rather recommended for teaching purposes and for lower-bound performance baselines in real-world applications.\n",
    "\n",
    "The name \"OneRule\" can be a bit misleading, because it is technically about \"one feature\" and not about \"one rule.\" I.e., OneR returns a feature for which one or more decision rules are defined. Essentially, as a simple classifier, it finds exactly one feature (and one or more feature values for that feature) to classify data instances.\n",
    "\n",
    "The basic procedure is as follows:\n",
    "\n",
    "- For each feature among all features (columns) in the dataset:\n",
    "    - For each feature value for the given feature: \n",
    "        1. Obtain the training examples with that feature value.\n",
    "        2. Obtain the class labels (and class label counts) corresponding to the training examples identified in the previous step.\n",
    "        3. Regard the class label with the highest frequency (count) as the majority class.\n",
    "        4. Record the number of errors as the number of training examples that have the given feature value but are not the majority class.\n",
    "    - Compute the error of the feature by summing the errors for all possible feature values for that feature.\n",
    "- Return the best feature, which is defined as the feature with the lowest error.\n",
    "\n",
    "\n",
    "Please note that the OneR algorithm assumes categorical (or discretized) feature values. A nice explanation and OneR classifier can be found in the Interpretable Machine Learning online chapter \"4.5.1 Learn Rules from a Single Feature (OneR)\" (https://christophm.github.io/interpretable-ml-book/rules.html, [2])."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### References\n",
    "\n",
    "[1] Holte, Robert C. \"Very simple classification rules perform well on most commonly used datasets.\" Machine learning 11.1 (1993): 63-90.\n",
    "\n",
    "[2] Interpretable Machine Learning (2018) by Christoph Molnar: https://christophm.github.io/interpretable-ml-book/rules.html"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example 1 -- Demonstrating OneR on a discretized Iris dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As mentioned in the overview text above, the OneR algorithm expects categorical or discretized features. The `OneRClassifier` implementation in MLxtend does not modify features in the dataset, and ensuring that the features are categorical is a responsibility of the user.\n",
    "\n",
    "In the following example, we will discretize the Iris dataset. In particular, we will convert the dataset into quartiles. In other words each feature value gets replaced by a categorical value. For sepal width (the first column in Iris), this would be\n",
    "\n",
    "- (0, 5.1] => 0\n",
    "- (5.1, 5.8] => 1\n",
    "- (5.8, 6.4] => 2\n",
    "- (6,4, 7.9] => 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below are the first 15 rows (flowers) of the original Iris data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[5.1, 3.5, 1.4, 0.2],\n",
       "       [4.9, 3. , 1.4, 0.2],\n",
       "       [4.7, 3.2, 1.3, 0.2],\n",
       "       [4.6, 3.1, 1.5, 0.2],\n",
       "       [5. , 3.6, 1.4, 0.2],\n",
       "       [5.4, 3.9, 1.7, 0.4],\n",
       "       [4.6, 3.4, 1.4, 0.3],\n",
       "       [5. , 3.4, 1.5, 0.2],\n",
       "       [4.4, 2.9, 1.4, 0.2],\n",
       "       [4.9, 3.1, 1.5, 0.1],\n",
       "       [5.4, 3.7, 1.5, 0.2],\n",
       "       [4.8, 3.4, 1.6, 0.2],\n",
       "       [4.8, 3. , 1.4, 0.1],\n",
       "       [4.3, 3. , 1.1, 0.1],\n",
       "       [5.8, 4. , 1.2, 0.2]])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from mlxtend.data import iris_data\n",
    "\n",
    "\n",
    "X, y = iris_data()\n",
    "X[:15]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is the discretized dataset. Each feature is divided into 4 quartiles."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 3, 0, 0],\n",
       "       [0, 1, 0, 0],\n",
       "       [0, 2, 0, 0],\n",
       "       [0, 2, 0, 0],\n",
       "       [0, 3, 0, 0],\n",
       "       [1, 3, 1, 1],\n",
       "       [0, 3, 0, 0],\n",
       "       [0, 3, 0, 0],\n",
       "       [0, 1, 0, 0],\n",
       "       [0, 2, 0, 0],\n",
       "       [1, 3, 0, 0],\n",
       "       [0, 3, 0, 0],\n",
       "       [0, 1, 0, 0],\n",
       "       [0, 1, 0, 0],\n",
       "       [1, 3, 0, 0]])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def get_feature_quartiles(X):\n",
    "    X_discretized = X.copy()\n",
    "    for col in range(X.shape[1]):\n",
    "        for q, class_label in zip([1.0, 0.75, 0.5, 0.25], [3, 2, 1, 0]):\n",
    "            threshold = np.quantile(X[:, col], q=q)\n",
    "            X_discretized[X[:, col] <= threshold, col] = class_label\n",
    "    return X_discretized.astype(np.int)\n",
    "\n",
    "Xd = get_feature_quartiles(X)\n",
    "Xd[:15]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given a dataset with categorical features, we can use the OneR classifier like similar to a scikit-learn estimator for classification. First, let's divide the dataset into training and test data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "\n",
    "Xd_train, Xd_test, y_train, y_test = train_test_split(Xd, y, random_state=0, stratify=y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can train a `OneRClassifier` model on the training set using the `fit` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mlxtend.classifier import OneRClassifier\n",
    "oner = OneRClassifier()\n",
    "\n",
    "oner.fit(Xd_train, y_train);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The column index of the selected feature is accessible via the `feature_idx_` attribute after model fitting:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "oner.feature_idx_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is also a `prediction_dict_` available after model fitting. It lists the total error for the selected feature (i.e., the feature listed under `feature_idx_`). Also it provides the classification rules:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'total error': 16, 'rules (value: class)': {0: 0, 1: 1, 2: 1, 3: 2}}"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "oner.prediction_dict_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I.e., ` 'rules (value: class)': {0: 0, 1: 1, 2: 1, 3: 2}` means that there are 3 rules for the selected feature (petal length):\n",
    "- if value is 0, classify as 0 (Iris-setosa)\n",
    "- if value is 1, classify as 1 (Iris-versicolor)\n",
    "- if value is 2, classify as 1 (Iris-versicolor)\n",
    "- if value is 3, classify as 2 (Iris-virginica)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After model fitting we can use the `oner` object for making predictions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 2, 2, 1,\n",
       "       0, 1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 0, 0, 1, 2, 1, 1, 2, 2, 1, 0, 1,\n",
       "       1, 1, 2, 0, 1, 2, 1, 2, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 2, 0, 1, 1,\n",
       "       0, 1, 2, 1, 2, 0, 1, 2, 1, 1, 2, 0, 1, 0, 0, 1, 1, 2, 0, 0, 0, 1,\n",
       "       0, 1, 2, 2, 2, 0, 1, 0, 2, 0, 1, 1, 1, 1, 0, 2, 2, 0, 1, 1, 0, 2,\n",
       "       1, 2])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "oner.predict(Xd_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training accuracy 85.71%\n"
     ]
    }
   ],
   "source": [
    "y_pred = oner.predict(Xd_train)\n",
    "train_acc = np.mean(y_pred == y_train)  \n",
    "print(f'Training accuracy {train_acc*100:.2f}%')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test accuracy 84.21%\n"
     ]
    }
   ],
   "source": [
    "y_pred = oner.predict(Xd_test)\n",
    "test_acc = np.mean(y_pred == y_test)  \n",
    "print(f'Test accuracy {test_acc*100:.2f}%')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead of computing the prediction accuracy manually as shown above, we can also use the `score` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test accuracy 84.21%\n"
     ]
    }
   ],
   "source": [
    "test_acc = oner.score(Xd_test, y_test)\n",
    "print(f'Test accuracy {test_acc*100:.2f}%')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# API"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "## OneRClassifier\n",
      "\n",
      "*OneRClassifier(resolve_ties='first')*\n",
      "\n",
      "OneR (One Rule) Classifier.\n",
      "\n",
      "**Parameters**\n",
      "\n",
      "- `resolve_ties` : str (default: 'first')\n",
      "\n",
      "    Option for how to resolve ties if two or more features\n",
      "    have the same error. Options are\n",
      "    - 'first' (default): chooses first feature in the list, i.e.,\n",
      "    feature with the lower column index.\n",
      "    - 'chi-squared': performs a chi-squared test for each feature\n",
      "    against the target and selects the feature with the lowest p-value.\n",
      "\n",
      "**Attributes**\n",
      "\n",
      "- `self.classes_labels_` : array-like, shape = [n_labels]\n",
      "\n",
      "    Array containing the unique class labels found in the\n",
      "    training set.\n",
      "\n",
      "\n",
      "- `self.feature_idx_` : int\n",
      "\n",
      "    The index of the rules' feature based on the column in\n",
      "    the training set.\n",
      "\n",
      "\n",
      "- `self.p_value_` : float\n",
      "\n",
      "    The p value for a given feature. Only available after calling `fit`\n",
      "    when the OneR attribute `resolve_ties = 'chi-squared'` is set.\n",
      "\n",
      "\n",
      "- `self.prediction_dict_` : dict\n",
      "\n",
      "    Dictionary containing information about the\n",
      "    feature's (self.feature_idx_)\n",
      "    rules and total error. E.g.,\n",
      "    `{'total error': 37, 'rules (value: class)': {0: 0, 1: 2}}`\n",
      "    means the total error is 37, and the rules are\n",
      "    \"if feature value == 0 classify as 0\"\n",
      "    and \"if feature value == 1 classify as 2\".\n",
      "    (And classify as class 1 otherwise.)\n",
      "\n",
      "    For usage examples, please see\n",
      "    [https://rasbt.github.io/mlxtend/user_guide/classifier/OneRClassifier/](https://rasbt.github.io/mlxtend/user_guide/classifier/OneRClassifier/)\n",
      "\n",
      "### Methods\n",
      "\n",
      "<hr>\n",
      "\n",
      "*fit(X, y)*\n",
      "\n",
      "Learn rule from training data.\n",
      "\n",
      "**Parameters**\n",
      "\n",
      "- `X` : {array-like, sparse matrix}, shape = [n_samples, n_features]\n",
      "\n",
      "    Training vectors, where n_samples is the number of samples and\n",
      "    n_features is the number of features.\n",
      "\n",
      "\n",
      "- `y` : array-like, shape = [n_samples]\n",
      "\n",
      "    Target values.\n",
      "\n",
      "**Returns**\n",
      "\n",
      "- `self` : object\n",
      "\n",
      "\n",
      "<hr>\n",
      "\n",
      "*get_params(deep=True)*\n",
      "\n",
      "Get parameters for this estimator.\n",
      "\n",
      "**Parameters**\n",
      "\n",
      "- `deep` : bool, default=True\n",
      "\n",
      "    If True, will return the parameters for this estimator and\n",
      "    contained subobjects that are estimators.\n",
      "\n",
      "**Returns**\n",
      "\n",
      "- `params` : mapping of string to any\n",
      "\n",
      "    Parameter names mapped to their values.\n",
      "\n",
      "<hr>\n",
      "\n",
      "*predict(X)*\n",
      "\n",
      "Predict class labels for X.\n",
      "\n",
      "**Parameters**\n",
      "\n",
      "- `X` : {array-like, sparse matrix}, shape = [n_samples, n_features]\n",
      "\n",
      "    Training vectors, where n_samples is the number of samples and\n",
      "    n_features is the number of features.\n",
      "\n",
      "**Returns**\n",
      "\n",
      "- `maj` : array-like, shape = [n_samples]\n",
      "\n",
      "    Predicted class labels.\n",
      "\n",
      "<hr>\n",
      "\n",
      "*score(X, y, sample_weight=None)*\n",
      "\n",
      "Return the mean accuracy on the given test data and labels.\n",
      "\n",
      "In multi-label classification, this is the subset accuracy\n",
      "which is a harsh metric since you require for each sample that\n",
      "each label set be correctly predicted.\n",
      "\n",
      "**Parameters**\n",
      "\n",
      "- `X` : array-like of shape (n_samples, n_features)\n",
      "\n",
      "    Test samples.\n",
      "\n",
      "\n",
      "- `y` : array-like of shape (n_samples,) or (n_samples, n_outputs)\n",
      "\n",
      "    True labels for X.\n",
      "\n",
      "\n",
      "- `sample_weight` : array-like of shape (n_samples,), default=None\n",
      "\n",
      "    Sample weights.\n",
      "\n",
      "**Returns**\n",
      "\n",
      "- `score` : float\n",
      "\n",
      "    Mean accuracy of self.predict(X) wrt. y.\n",
      "\n",
      "<hr>\n",
      "\n",
      "*set_params(**params)*\n",
      "\n",
      "Set the parameters of this estimator.\n",
      "\n",
      "The method works on simple estimators as well as on nested objects\n",
      "(such as pipelines). The latter have parameters of the form\n",
      "``<component>__<parameter>`` so that it's possible to update each\n",
      "component of a nested object.\n",
      "\n",
      "**Parameters**\n",
      "\n",
      "- `**params` : dict\n",
      "\n",
      "    Estimator parameters.\n",
      "\n",
      "**Returns**\n",
      "\n",
      "- `self` : object\n",
      "\n",
      "    Estimator instance.\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "with open('../../api_modules/mlxtend.classifier/OneRClassifier.md', 'r') as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  },
  "toc": {
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
